Subband-based Speech Recognition in Noisy Conditions: the Full Combination Approach
Authors
Abstract
In this report we investigate and compare different subband-based Automatic Speech Recognition (ASR) approaches, including an original approach, referred to as the full combination approach, based on an estimate of the noise-weighted sum of posterior probabilities for all possible subband combinations. We show that the proposed estimate is a good approximation of the ideal, but often impractical, solution of explicitly considering all possible subband subsets. This approximation results in a nonlinear, yet simple and easy-to-implement, combination function. As opposed to other subband-based approaches, we believe that the proposed solution is closer to optimal, is mathematically correct, and allows us to relax some of the subband independence assumptions. Similarly to this full posterior combination approach, which combines the subbands after independent processing, a full feature combination approach is investigated, in which all the possible subband features are orthogonalized and combined into a single feature vector before probability estimation. The different approaches have been tested and compared on the Numbers database (free-format numbers) with different levels of Noisex car noise. This was done on the basis of two different acoustic features, namely PLP and J-RASTA-PLP features, and different weighting schemes. Those experiments show that the full combination approximation yields very good estimates of the actual full combination posteriors, and that both approaches yield very good recognition performance.

Acknowledgements: The support of the OFES under the grant for the Speech, Hearing and Recognition (SPHEAR) project (OFES) is gratefully acknowledged. This paper benefited from fruitful discussions with my colleagues at IDIAP, including Hervé Glottin, Katrin Keller and Christopher Kermorvant, as well as colleagues at other institutes, such as Stephan Dupont at FPMs in Mons, Belgium, and Nikky Mirghafori and Brian Kingsbury at ICSI, Berkeley, USA.
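The weighted sum over subband subsets described in the abstract can be illustrated with a small sketch. The abstract does not give the approximation formula, so the version below is an assumption: it approximates each subset posterior from single-band posteriors by treating the bands as conditionally independent given the class, i.e. P(q | x_S) proportional to P(q) times the product over bands i in S of P(q | x_i) / P(q). The function names, the toy numbers and the uniform weights are all hypothetical and only serve the illustration.

```python
from itertools import combinations

import numpy as np


def subset_posterior(band_posteriors, priors, subset):
    """Approximate P(q | x_S) for one subset S of subbands.

    Assumption (not stated in the abstract): bands are conditionally
    independent given the class, so
    P(q | x_S) is proportional to P(q) * prod_{i in S} P(q | x_i) / P(q).
    """
    if not subset:
        # Empty subset: no acoustic evidence, fall back to the priors.
        return priors.copy()
    log_p = np.log(priors)
    for i in subset:
        log_p = log_p + np.log(band_posteriors[i]) - np.log(priors)
    p = np.exp(log_p - log_p.max())  # subtract max for numerical stability
    return p / p.sum()               # renormalise over classes


def full_combination(band_posteriors, priors, weights):
    """Weighted sum of (approximate) posteriors over all subband subsets."""
    n_bands = len(band_posteriors)
    total = np.zeros_like(priors)
    for size in range(n_bands + 1):
        for subset in combinations(range(n_bands), size):
            total += weights[subset] * subset_posterior(
                band_posteriors, priors, subset
            )
    return total


# Toy example: 2 subbands, 3 classes, uniform subset weights (hypothetical).
priors = np.array([0.5, 0.3, 0.2])
band_posteriors = np.array([[0.7, 0.2, 0.1],
                            [0.4, 0.4, 0.2]])
subsets = [s for k in range(3) for s in combinations(range(2), k)]
weights = {s: 1.0 / len(subsets) for s in subsets}
posterior = full_combination(band_posteriors, priors, weights)
```

With weights that sum to one, the combined vector is again a probability distribution over classes. In a noisy setting the weights would instead reflect how reliable each subset of bands is believed to be, which is where the different weighting schemes mentioned in the abstract come in.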
Similar resources
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
In recent years, automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Different Weighting Schemes in the Full Combination Subbands Approach for Noise Robust ASR
In this paper, we present and investigate a new method for subband-based Automatic Speech Recognition (ASR) which approximates the ideal ‘full combination’ approach which is itself often not practical to realize. The ‘full combination’ approach consists of explicitly considering all possible combinations of subbands [6] avoiding the usually necessary independence assumption, which would limit t...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
The quality of a speech signal degrades significantly in the presence of environmental noise, which impairs the performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...
A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...
Feature Extracting in the Presence of Environmental Noise, using Subband Adaptive Filtering
In this work, a new feature extraction method for noisy environments is proposed. The approach is based on subband decomposition of speech signals followed by adaptive filtering in the noisiest subbands of speech. The speech decomposition is obtained using a low-complexity octave filter bank, while adaptive filtering is performed using the normalized least mean square algorithm. The performance o...